TB-WPRO: Title-Block Based Web Page Reorganization

Authors

  • Qihua Chen
  • Xiangdong Wang
  • Yueliang Qian
Abstract

For cell phone users and blind people using non-visual browsers, browsing the Web with common browsers is quite inefficient because of information overload. This paper presents the TB-WPRO (Title-Block based Web Page Re-Organization) method, which hierarchically segments web pages into blocks using visual and layout information that reflects the web designers’ intent. TB-WPRO segments web pages with the clear goal of extracting self-described title blocks. To reorganize a web page, the segmentation result is transformed into a series of small web pages that can be easily accessed. Compared to current methods, the proposed approach obtains a promising segmentation result in which blocks are visually and semantically consistent with the original web pages.

DOI: 10.4018/978-1-4666-2645-4.ch007

Similar resources

A Web Page Segmentation Method by using Headlines to Web Contents as Separators and its Evaluations

In this paper, we describe a Web page segmentation method based on title blocks and show its evaluation. Title blocks are minimum blocks that function as headlines for specific Web content. A typical Web page consists of multiple elements with different types of features, such as main content, navigation panels, copyright and privacy notices, and advertisements. Web page segmentation is the div...

Using Document Structure on Retrieving Webpages at the Web-CLEF 2006

We present a report on our participation in the mixed monolingual web task of the 2006 Cross-Language Evaluation Forum (CLEF). We compared the result of web page retrieval based on the page content, page title, and anchor page. The retrieval effectiveness for the combination of page content, page title, and anchor texts was better than that of the combination of page title and page title only. ...

Using Web Page Titles to Rediscover Lost Web Pages

Titles are denoted by the TITLE element within a web page. We queried the title against the Yahoo search engine to determine the page’s status (found, not found). We conducted several tests based on elements of the title. These tests were used to discern whether we could predict a page’s status based on the title. Our results increase our ability to determine bad titles but not our ability t...

Journal:
  • IJAPUC

Volume 3

Publication date: 2011